Automated Generalization of Translation Examples

Author

  • Ralf D. Brown
Abstract

Previous work has shown that adding generalization of the examples in the corpus of an example-based machine translation (EBMT) system can reduce the required amount of pretranslated example text by as much as an order of magnitude for Spanish-English and French-English EBMT. Using word clustering to automatically generalize the example corpus can provide the majority of this improvement for French-English with no manual intervention; the prior work required a large bilingual dictionary tagged with parts of speech and the manual creation of grammar rules. By seeding the clustering with a small amount of manually created information, even better performance can be achieved. This paper describes a method whereby bilingual word clustering can be performed using standard monolingual document clustering techniques, and its effectiveness at reducing the size of the example corpus required.
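As a rough illustration of why generalization shrinks the example corpus, the sketch below replaces clustered word pairs with class tokens so that several concrete examples collapse into one template. It is not the paper's implementation: the `CLUSTERS` table, the class labels, and the `generalize` function are hypothetical, and the clusters are written by hand here, whereas the abstract describes obtaining them by clustering.

```python
from typing import Dict, List, Tuple

# Hypothetical clusters: each class groups (French, English) word pairs
# assumed to behave alike. Hand-written purely for illustration.
CLUSTERS: Dict[str, List[Tuple[str, str]]] = {
    "<CL_DAY>": [("lundi", "Monday"), ("mardi", "Tuesday")],
    "<CL_CITY>": [("Paris", "Paris"), ("Londres", "London")],
}


def generalize(
    src: str,
    tgt: str,
    clusters: Dict[str, List[Tuple[str, str]]] = CLUSTERS,
) -> Tuple[str, str]:
    """Replace a clustered word pair with its class token on both sides.

    A pair is replaced only when the source word appears in the source
    sentence AND its translation appears in the target sentence.
    Only the first occurrence of each word is replaced.
    """
    src_tokens = src.split()
    tgt_tokens = tgt.split()
    for label, pairs in clusters.items():
        for s_word, t_word in pairs:
            if s_word in src_tokens and t_word in tgt_tokens:
                src_tokens[src_tokens.index(s_word)] = label
                tgt_tokens[tgt_tokens.index(t_word)] = label
    return " ".join(src_tokens), " ".join(tgt_tokens)


# Two different example pairs reduce to the same generalized template.
a = generalize("Il arrive à Londres lundi", "He arrives in London Monday")
b = generalize("Il arrive à Paris mardi", "He arrives in Paris Tuesday")
```

Because both pairs map to the same template, a corpus stored in generalized form needs only one entry to cover every combination of class members, which is the intuition behind the corpus-size reduction the abstract reports.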


Similar resources

GENERALIZATION OF TITCHMARSH'S THEOREM FOR THE GENERALIZED FOURIER-BESSEL TRANSFORM

In this paper, using a generalized translation operator, we prove the estimates for the generalized Fourier-Bessel transform in the space $L^2$ on certain classes of functions.


GENERALIZATION OF TITCHMARSH'S THEOREM FOR THE DUNKL TRANSFORM IN THE SPACE $L^P(R)$

In this paper, using a generalized Dunkl translation operator, we obtain a generalization of Titchmarsh's Theorem for the Dunkl transform for functions satisfying the $(\psi,p)$-Lipschitz Dunkl condition in the space $\mathrm{L}_{p,\alpha}=\mathrm{L}^{p}(\mathbb{R},|x|^{2\alpha+1}dx)$, where $\alpha>-\frac{1}{2}$.


Learning Translation Templates with Type Constraints

This paper presents a generalization technique that induces translation templates from given translation examples by replacing differing parts in these examples with typed variables. Since the type of each variable is also inferred during the learning process, each induced template is associated with a set of type constraints. The type constraints that are associated with a translation template...


Improving Example Based Machine Translation Through Morphological Generalization and Adaptation

Example Based Machine Translation (EBMT) is limited by the quantity and scope of its training data. Even with a reasonably large corpus, we will not have examples that cover everything we want to translate. This problem is especially severe in Arabic due to its rich morphology. We demonstrate a novel method that exploits the regular nature of Arabic morphology to increase the quality and covera...


Arabic-to-English Example Based Machine Translation Using Context-Insensitive Morphological Analysis

We describe and discuss the results of ongoing experiments that use morphological analysis in the context of Example-Based Machine Translation. The goal is to increase the coverage of our training examples so as to capture things that are not directly seen in the training text. This is done through a two-stage process of generalization and filtering.




Publication date: 2000